Biostatistics 304. Cluster analysis.
Abstract
In cluster analysis, we seek to identify the “natural” structure of groups, if one exists, based on a multivariate profile, such that within-group variation is minimised and between-group variation is maximised. The objective is to reduce the data into manageable, bite-sized groups that can be used in further analysis or in developing hypotheses about the nature of the data. The technique is exploratory, descriptive and non-inferential. It will always create clusters, whether or not they are meaningful. The solutions are not unique, since they depend on the variables used and on how cluster membership is defined. No essential assumptions are required for its use, except that the variables must be selected with regard to some theoretical or conceptual rationale. For simplicity, we shall use 10 subjects to demonstrate how cluster analysis works. We are interested in grouping these 10 subjects into compliance-on-medication-taking subgroups (for example) based on four biomarkers, and later in using the clusters for further analysis, say, to profile compliant vs non-compliant subjects. The descriptives are given in Table I, with higher values indicating better compliance.
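As a rough illustration of the idea described above, the following sketch groups 10 subjects on four biomarkers and then compares within-group and between-group variation. It is not the article's own procedure: the biomarker values are invented for illustration (they are not the Table I data), average-linkage agglomerative clustering is just one of several possible methods, and the helper names `average_linkage`, `agglomerate` and `within_ss` are hypothetical. The sketch also assumes the four biomarkers share a comparable scale, so no standardisation is applied.

```python
from itertools import combinations
from math import dist

# Hypothetical biomarker values for 10 subjects (NOT the article's Table I).
# Higher values are taken to indicate better compliance.
subjects = [
    (8.1, 7.9, 8.4, 7.6),
    (7.8, 8.2, 7.9, 8.0),
    (8.5, 7.5, 8.1, 7.7),
    (7.9, 8.0, 8.3, 8.2),
    (8.2, 7.7, 7.8, 7.9),
    (3.1, 2.8, 3.5, 2.9),
    (2.7, 3.2, 3.0, 3.3),
    (3.4, 2.9, 2.6, 3.1),
    (2.9, 3.3, 3.2, 2.7),
    (3.2, 3.0, 2.8, 3.4),
]


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs of subjects."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


def within_ss(cluster, points):
    """Sum of squared distances from each member to its cluster centroid."""
    centroid = [sum(col) / len(cluster) for col in zip(*(points[i] for i in cluster))]
    return sum(dist(points[i], centroid) ** 2 for i in cluster)


clusters = agglomerate(subjects, n_clusters=2)
total_ss = within_ss(list(range(len(subjects))), subjects)
within = sum(within_ss(c, subjects) for c in clusters)
between = total_ss - within  # total SS = within-group SS + between-group SS

for label, members in enumerate(clusters, start=1):
    print(f"Cluster {label}: subjects {[m + 1 for m in members]}")
print(f"Within-group SS = {within:.2f}, between-group SS = {between:.2f}")
```

Calling `agglomerate(subjects, n_clusters=3)` on the same data would still return three groups, which reflects the caution above: the technique always produces clusters, so the result has to be judged against the conceptual rationale for the variables.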
Journal title:
- Singapore medical journal
Volume 46, Issue 4
Pages: -
Publication date: 2005